Caio Raphael

A character encoding maps characters (text) to byte sequences and vice versa. It is used for text fields in serialization.
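A minimal Python sketch of that round trip between text and bytes, assuming UTF-8 as the encoding (the sample string is arbitrary):

```python
text = "caf\u00e9"  # "café"; the last character is outside ASCII

# Encode: characters -> bytes.
data = text.encode("utf-8")
print(data)                   # b'caf\xc3\xa9'
print(len(text), len(data))   # 4 5

# Decode: bytes -> characters.
restored = data.decode("utf-8")
print(restored == text)       # True
```

Four characters become five bytes because the accented letter needs two bytes in UTF-8.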

When someone says "custom packet encoding", they usually mean one of:
- A framing protocol (how the start and end of a message are delimited).
- A custom serialization/deserialization strategy.
- A binary or textual format for transmitting structures over the network.
Using "encoding" for packet framing strategies is technically valid but ambiguous, so in networking it is better to use the more specific term (framing, serialization, wire format).
The word "encoding" itself is not wrong, but it has to be read in its technical context.
In Odin, JSON and CBOR are considered "encoding".
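To make the distinction concrete, here is a hedged Python sketch that keeps the three layers separate: serialization (JSON), character encoding (UTF-8), and framing (a 4-byte big-endian length prefix). The length prefix is just one common framing choice, and the names `frame` and `unframe` are hypothetical helpers invented for this illustration.

```python
import json
import struct


def frame(message: dict) -> bytes:
    # Serialization: structure -> JSON text.
    text = json.dumps(message)
    # Character encoding: text -> bytes.
    payload = text.encode("utf-8")
    # Framing: a 4-byte big-endian length prefix marks where the message ends.
    return struct.pack(">I", len(payload)) + payload


def unframe(buffer: bytes) -> tuple[dict, bytes]:
    # Read the length prefix, then slice out exactly that many payload bytes.
    (length,) = struct.unpack(">I", buffer[:4])
    payload = buffer[4:4 + length]
    rest = buffer[4 + length:]
    return json.loads(payload.decode("utf-8")), rest


# Two messages back to back in one byte stream.
stream = frame({"id": 1}) + frame({"id": 2})
first, stream = unframe(stream)
second, stream = unframe(stream)
print(first, second, stream)  # {'id': 1} {'id': 2} b''
```

Each layer could be swapped independently (CBOR instead of JSON, a delimiter instead of a length prefix), which is why a single word like "encoding" is ambiguous here.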

UTF-8 (Unicode Transformation Format – 8-bit)
Size:
- ASCII characters (0–127) use 1 byte
- Non-ASCII characters use 2 to 4 bytes
- For languages written mostly in characters that need 3 bytes in UTF-8 (e.g., Chinese, Japanese), it can take more space than UTF-16, which stores the same characters in 2 bytes
Web standard (used by HTML, JSON, XML, etc.)
Backward compatible with ASCII: valid ASCII text is valid UTF-8
Serialization:
- UTF-8 can be considered a form of serialization for text: it turns a sequence of code points into a sequence of bytes
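The byte counts above can be checked with a short Python sketch. The sample characters are arbitrary picks, one for each UTF-8 length:

```python
# One sample character per UTF-8 length (written as escapes to avoid ambiguity).
samples = {
    "A": "A",                 # U+0041, ASCII
    "e-acute": "\u00e9",      # U+00E9, Latin-1 range
    "CJK": "\u4e2d",          # U+4E2D, Chinese character
    "emoji": "\U0001F600",    # U+1F600, outside the BMP
}

sizes = {name: len(ch.encode("utf-8")) for name, ch in samples.items()}
print(sizes)  # {'A': 1, 'e-acute': 2, 'CJK': 3, 'emoji': 4}

# ASCII compatibility: pure ASCII text yields identical bytes in both encodings.
ascii_text = "hello"
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")
print(same_bytes)  # True
```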

UTF-16 (Unicode Transformation Format – 16-bit)
Size:
- BMP characters (Basic Multilingual Plane, U+0000 to U+FFFF) use 2 bytes
- Characters outside the BMP (e.g., emojis, historical scripts) use 4 bytes (a surrogate pair)
- More compact than UTF-8 for text dominated by BMP characters above U+07FF (e.g., many Asian scripts), which take 3 bytes each in UTF-8
Widely used in some APIs and programming languages (e.g., Java, Windows, .NET)
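A small Python sketch of the UTF-16 sizes, including a surrogate pair and a size comparison with UTF-8 for a short Japanese string. The helper name `utf16_size` and the sample text are assumptions made for illustration:

```python
def utf16_size(text: str) -> int:
    # "utf-16-le" is used so that no byte order mark is prepended.
    return len(text.encode("utf-16-le"))


print(utf16_size("A"))            # 2  (BMP)
print(utf16_size("\u4e2d"))       # 2  (BMP, Chinese character)
print(utf16_size("\U0001F600"))   # 4  (outside the BMP)

# The 4 bytes are a high surrogate followed by a low surrogate.
pair = "\U0001F600".encode("utf-16-be").hex()
print(pair)  # d83dde00

# Three Japanese characters: 3 bytes each in UTF-8, 2 bytes each in UTF-16.
cjk = "\u65e5\u672c\u8a9e"
utf8_bytes = len(cjk.encode("utf-8"))
utf16_bytes = utf16_size(cjk)
print(utf8_bytes, utf16_bytes)  # 9 6
```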

ASCII (American Standard Code for Information Interchange)
Legacy system compatibility: for old systems or devices that only support ASCII
Simple English text: when text contains only basic characters (A–Z letters, 0–9 digits, basic punctuation)
Simplicity: ASCII is a 7-bit code (128 characters) that is normally stored as 1 byte per character, which simplifies processing in very basic systems
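As a quick Python illustration of those properties, the sketch below checks the one-byte-per-character relationship and shows what happens when the text contains a character ASCII cannot represent. The sample strings are arbitrary:

```python
text = "Hello, 123!"
data = text.encode("ascii")

# One byte per character, and every byte value fits in 7 bits.
print(len(text) == len(data))        # True
print(all(b < 128 for b in data))    # True

# A non-ASCII character cannot be encoded with a strict ASCII codec.
try:
    "caf\u00e9".encode("ascii")
    failed = False
except UnicodeEncodeError:
    failed = True
print(failed)  # True
```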

Base64 is a way to represent arbitrary binary data using only printable ASCII characters.
It is not encryption or compression, just an encoding so that binary data can be stored safely in text formats.
It is called Base64 because the encoding uses a numeral system with 64 distinct symbols to represent data.
- Each Base64 character encodes 6 bits.
- So you need exactly 64 symbols (2^6 = 64) to represent every possible 6-bit value.
Base64 exists because many systems historically handled text only.
Raw binary can contain bytes that such systems may alter or reject:
- null bytes (0x00)
- control characters
- non-printable bytes
Base64 converts binary → safe text using only:
```
A–Z a–z 0–9 + /
```
Padding uses "=", which is not one of the 64 symbols.
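The alphabet and the padding behaviour can be checked with Python's standard `base64` module. This is only a sketch: `ALPHABET` is a name introduced here, and the sample bytes are arbitrary.

```python
import base64
import string

# The 64 symbols of the standard alphabet, in index order.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
print(len(ALPHABET))  # 64

# Bytes that are awkward in text: a null byte, a control character, high bytes.
raw = bytes([0x00, 0x07, 0xFF, 0x80])
encoded = base64.b64encode(raw).decode("ascii")

# Every output character is either in the alphabet or the "=" padding sign.
only_safe = all(ch in ALPHABET or ch == "=" for ch in encoded)
print(only_safe)  # True

# Decoding restores the exact bytes; nothing is hidden or compressed.
print(base64.b64decode(encoded) == raw)  # True

# Padding fills the last group when the input is not a multiple of 3 bytes.
print(base64.b64encode(b"M"))    # b'TQ=='
print(base64.b64encode(b"Ma"))   # b'TWE='
print(base64.b64encode(b"Man"))  # b'TWFu'
```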

encoded_size = ceil(input_size / 3) * 4 (exact when padding is included)
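A short Python check of the size formula against the standard library encoder, which emits padding by default. The function name `encoded_size` simply mirrors the formula and is not a library API:

```python
import base64


def encoded_size(input_size: int) -> int:
    # Integer form of ceil(input_size / 3) * 4.
    return (input_size + 2) // 3 * 4


# Compare the formula with the real encoder for a range of input lengths.
matches = all(
    len(base64.b64encode(bytes(n))) == encoded_size(n)
    for n in range(0, 100)
)
print(matches)            # True
print(encoded_size(1))    # 4
print(encoded_size(3))    # 4
print(encoded_size(4))    # 8
```

The output is therefore roughly 33% larger than the input.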

3 bytes (24 bits) → 4 Base64 characters

3 × 8 = 24 bits
4 × 6 = 24 bits

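Example: encoding the ASCII text "Man"

Write the bytes in binary:

M = 01001101
a = 01100001
n = 01101110

Concatenate them into one 24-bit string:

010011010110000101101110

Split it into four 6-bit groups and read each as a number:

010011 = 19
010110 = 22
000101 = 5
101110 = 46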

Map to Base64 alphabet:

0–25  → A–Z
26–51 → a–z
52–61 → 0–9
62    → +
63    → /

Output:

19 → T
22 → W
5  → F
46 → u

TWFu
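The worked example can be reproduced step by step in Python. This is a minimal sketch that handles exactly one 3-byte group with no padding; `ALPHABET` and `encode_group` are names made up for this illustration, and the result is compared with the standard `base64` module.

```python
import base64
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_group(three_bytes: bytes) -> str:
    # Step 1: write the 3 bytes as one 24-bit string.
    bits = "".join(f"{b:08b}" for b in three_bytes)
    # Step 2: split into four 6-bit groups and read each as an index.
    indices = [int(bits[i:i + 6], 2) for i in range(0, 24, 6)]
    # Step 3: map each index to its symbol in the alphabet.
    return "".join(ALPHABET[i] for i in indices)


bits = "".join(f"{b:08b}" for b in b"Man")
print(bits)                                               # 010011010110000101101110
print([int(bits[i:i + 6], 2) for i in range(0, 24, 6)])   # [19, 22, 5, 46]

result = encode_group(b"Man")
print(result)                                             # TWFu
print(result == base64.b64encode(b"Man").decode("ascii")) # True
```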